An Empirical Comparison of Weighting Functions for Multi-label Distance-Weighted K-nearest Neighbour Method

Author

  • Jianhua Xu
Abstract

Multi-label classification is an extension of classical multi-class classification in which any instance can be associated with several classes simultaneously, so the classes are no longer mutually exclusive. It has been shown experimentally that the distance-weighted k-nearest neighbour (DWkNN) algorithm is superior to the original kNN rule for multi-class learning. However, it has not been investigated whether the distance-weighted strategy is valid for multi-label learning, or which weighting function performs well. In this paper, we provide a concise multi-label DWkNN form (MLC-DWkNN). Furthermore, four weighting functions (Dudani’s linear function varying from 1 to 0, Macleod’s linear function ranging from 1 to 1/2, Dudani’s inverse distance function, and Zavrel’s exponential function) are collected and then investigated through detailed experiments on three benchmark data sets with Manhattan distance. Our study demonstrates that Dudani’s linear and Zavrel’s exponential functions work well, and moreover that MLC-DWkNN with these two functions outperforms an existing kNN-based multi-label classifier, ML-kNN.
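
To make the four weighting functions concrete, the following is a minimal sketch in Python. The formulas are the forms commonly given in the kNN literature for these names, not quotations from the paper: the Macleod variant fixes its free parameter so that weights run from 1 to 1/2, and the exponential variant defaults to `alpha = beta = 1`, which is an assumption. The `predict_labels` helper is hypothetical and is not the paper's MLC-DWkNN rule; it assumes a label is predicted when its weighted vote exceeds half of the total neighbour weight.

```python
import math

def manhattan(a, b):
    # Manhattan (L1) distance, the metric named in the abstract.
    return sum(abs(x - y) for x, y in zip(a, b))

# Each function takes the k neighbour distances sorted in ascending order.

def dudani_linear(d):
    # Linear weights from 1 (nearest) down to 0 (k-th neighbour).
    d1, dk = d[0], d[-1]
    if dk == d1:
        return [1.0] * len(d)
    return [(dk - di) / (dk - d1) for di in d]

def macleod_linear(d):
    # Linear weights from 1 (nearest) down to 1/2 (k-th neighbour).
    d1, dk = d[0], d[-1]
    if dk == d1:
        return [1.0] * len(d)
    return [(dk - di + (dk - d1)) / (2 * (dk - d1)) for di in d]

def dudani_inverse(d, eps=1e-12):
    # Inverse-distance weights; eps (an assumption) guards against d = 0.
    return [1.0 / max(di, eps) for di in d]

def zavrel_exponential(d, alpha=1.0, beta=1.0):
    # Exponential decay; alpha and beta defaults are assumptions.
    return [math.exp(-alpha * di ** beta) for di in d]

def predict_labels(query, X, Y, k, weight_fn):
    # Hypothetical multi-label decision rule (NOT the paper's MLC-DWkNN):
    # keep a label if its weighted vote exceeds half the total weight.
    order = sorted(range(len(X)), key=lambda i: manhattan(query, X[i]))[:k]
    dists = [manhattan(query, X[i]) for i in order]
    weights = weight_fn(dists)
    total = sum(weights)
    votes = {}
    for i, w in zip(order, weights):
        for label in Y[i]:
            votes[label] = votes.get(label, 0.0) + w
    return {label for label, v in votes.items() if v > total / 2}

# Toy data: feature vectors and their label sets.
X = [(0, 0), (1, 0), (0, 2), (5, 5)]
Y = [{"a"}, {"a", "b"}, {"b"}, {"c"}]
print(predict_labels((0, 0), X, Y, k=3, weight_fn=dudani_linear))
print(predict_labels((0, 0), X, Y, k=3, weight_fn=macleod_linear))
```

On this toy data the two linear functions can disagree: Dudani’s function gives the k-th neighbour zero weight, whereas Macleod’s function still lets it contribute half a vote.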


Similar resources

Class-Based Attribute Weighting for Time Series Classification

In this paper, we present two novel class-based weighting methods for the Euclidean nearest neighbor algorithm and compare them with global weighting methods considering empirical results on a widely accepted time series classification benchmark dataset. Our methods provide higher accuracy than every global weighting in nearly half of the cases and they have better overall performance. We concl...


Evolutionary feature weighting to improve the performance of multi-label lazy algorithms

In the last decade several modern applications where the examples belong to more than one label at a time have attracted the attention of research into machine learning. Several derivatives of the k-nearest neighbours classifier to deal with multi-label data have been proposed. A k-nearest neighbours classifier has a high dependency with respect to the definition of a distance function, which i...


How k-Nearest Neighbor Parameters Affect its Performance

The k-Nearest Neighbor is one of the simplest Machine Learning algorithms. Besides its simplicity, k-Nearest Neighbor is a widely used technique, being successfully applied in a large number of domains. In k-Nearest Neighbor, a database is searched for the most similar elements to a given query element, with similarity defined by a distance function. In this work, we are most interested in the ...


An efficient weighted nearest neighbour classifier using vertical data representation

The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algorithm, called PINE, using vertical data representation. A metric called HOBBit is used as the distance metric. The PINE algorithm applies a Gaussian podium function to set weights to different neighbours. We compare PIN...


Optimal weighted nearest neighbour classifiers

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends asymptotically only on the dimension d of the feature vec...




Publication date: 2011